Microcircuit means for the implementation of
RSA encryption and decryption transformations
are described, which consist of a plurality of
units, to which external adders are attached,
thereby incorporating the RSA function.

Field of The Invention

This invention relates to microcircuit means for
implementation of RSA encryption and decryption
transformations and other cryptographic large operand
operations.

BACKGROUND OF THE INVENTION 

The algorithm which the invention is intended to
implement is known in the art and was described in:
R. L. Rivest, A. Shamir and L. Adleman, "A Method for
Obtaining Digital Signatures and Public Key
Cryptosystems", Comm. ACM, vol. 21, pp. 120-126, 1978.
Its implementation involves exponentiation and modulo
reduction, as indicated by the formula:
M^E mod N = C. The invention is also intended to
implement many other large operand algorithms suggested
in the art and by standardization organizations, which
necessitate conventional and modular arithmetic. The
arithmetic process involved is modulo multiplication,
which requires addition, subtraction and shifting.

Fig. 1, attached hereto, illustrates the flow chart
of the algorithm for operands which are no larger than
the registers. Therein M indicates the message, N the
modulo, and E the exponentiation key. The implementation
of the algorithm essentially requires six registers.
Since the data block, the exponentiation and
modulo reduction whereof are to be carried out, may
comprise a large number of bits, e.g. several hundreds,
the microcircuits required to implement it in the
way practiced in the prior art may become quite
expensive, because the registers become accordingly
larger and require relatively large silicon surfaces,
the cost of which, as well known to persons
skilled in the art, increases dramatically with the
increase in size.

The circuit also enables the encryption and decryption
of messages whose size is larger than that of
the registers. This is to be done by an interleaved
Montgomery reduction, as known in the art [P. L.
Montgomery, "Modular Multiplication Without Trial
Division", Mathematics of Computation, vol. 44, pp.
519-521, 1985], and/or by utilizing a novel approach
to processing of double precision operands by single
precision hardware. In particular, the circuit naturally
enables the calculation of J = -N^(-1) mod 2^L, where N is
the modulus and L is the word size. The value of J is
necessary for divisionless modular exponentiation as
known to experts in the art. Also in particular, the
circuit enables a double precision division.



Particularly, it is a purpose of the invention to provide
microcircuit means of the kind described which
are compact in size and are able to accommodate large
data blocks without excessively increased costs.

It is a further object of the invention to provide
such microcircuit means which are more economical
and convenient to manufacture than those available
in the prior art.

It is a further object of the invention to provide
means for using larger keys than those specified by
the register sizes, by enabling a natural implementation
of the Montgomery algorithm or a double precision
modular arithmetic without adding a large
amount of hardware.

These and other objects of the invention will
become clear as the description proceeds.

The microcircuit according to the invention is
characterized in that it consists of a plurality of
identical cells, to which extra unifying hardware, in the
form of adders, is attached, altogether implementing
the RSA function. Accordingly, each cell comprises a
1-bit slice of the required registers.

Microcircuit means according to the invention
may be constructed from a plurality of cells connected
in series with the required adders, which, by using
tailored switching circuitry, afford a plurality of
arithmetic operations.

In a form of the invention, secondary modular
units are provided, which comprise a number of cells,
e.g. 4 cells, with the required common adder or
adders, thereby originating a 4-bit secondary modular
unit. A number of these latter may be likewise combined,
with the required common adder or adders, to
provide a tertiary, e.g. 16-bit (modular) unit. In general,
a microcircuit according to the invention may
comprise N units, each comprising n primary units and
being therefore an n-bit unit, where "n" is preferably a
multiple of 4.

The implementation of the algorithm requires, in
addition to the registers, 1, 2 or 3 suitable adders.
Secondary and tertiary or even larger modular units or
cells may comprise a single adder or a plurality
thereof, these alternatives giving rise to a variety of
embodiments of the invention. CLA (Carry Look
Ahead) adders are preferably used and reference is
made to them hereinafter; however, any other type of
adder might be used instead.

The invention will be better understood from the
description of a number of embodiments, with
reference to the drawings, wherein:

Brief Description of The Drawings 

Fig. 1 is a flow chart of the modular exponentiation
algorithm which the invention implements:
M^E mod(N);

Fig. 2 is a flow chart illustrating modular
multiplication (M*C)mod(N) using one adder.


SUMMARY OF THE INVENTION

It is a purpose of the invention to provide
microcircuit means, for the implementation of the algorithm
described, which are free from the disadvantages of
the prior art.

Fig. 3 is a like flow chart illustrating the same
modular multiplication (M*C)mod(N), but using two or
three adders.

Fig. 4 schematically illustrates a cell according to
an embodiment of the invention.

Fig. 5 schematically illustrates a cell according to
an embodiment of the invention, in which embodiment
one adder is used.

Fig. 6 is a complete L-bit long block diagram
including one adder.

Fig. 7 schematically illustrates a cell according to
an embodiment in which one CLA adder is used.

Fig. 8 illustrates a 4-bit CLA secondary modular
unit.

Fig. 9 illustrates a 4-bit modular unit including 4
basic cells and a 4-bit CLA.

Fig. 10 illustrates a 512-bit long unit.

Fig. 11 illustrates a cell according to an embodiment
in which two CLA adders are used.

Fig. 12 is a complete L-bit long block diagram of
a unit comprising two adders.

Fig. 13 illustrates a cell according to an embodiment
in which three CLA adders are used.

Fig. 14 is a complete L-bit long block diagram of
a unit comprising three adders.

Fig. 15 describes the configuration of the circuit
as a multiplier.

Fig. 16 describes the way the circuit calculates
J = -N^(-1) mod 2^L, where N is the modulus and L is
the word size.

Fig. 17 describes the configuration of the circuit
as a division circuit.

Detailed Description of Preferred Embodiments 

In the embodiments described, the microcircuit
according to the invention which constitutes the
modular exponentiator includes six shift registers, which
are assumed to be hundreds of bits long, and the
adders. The six registers are:

1) E register - Cyclic Left shifting register for
storing and shifting the decryption/encryption key E.

2) M register - Cyclic Right shifting register for
storing and shifting the message M.

3) C register - Right shifting register for storing the
multiplicative accumulator.

4) S register - Right shifting register for storing the
adder's result.

5) B register - Left shifting register for holding the
multiplicand.

6) N register - Right shifting register for storing the
inverted modulo number N.

The adders perform the following additions: 

1) S+B 

2) S-N 

3) 2B-N 

Two counters - P and M - and a small ROM (Read
Only Memory) control the operation of the microcircuit.

Counter P is the power counter for counting the
number of encryption key bits processed.

Counter M is the multiplier counter for counting
the number of multiplier bits used.

Fig. 1, as stated hereinbefore, is the flow chart for
calculating C = M^E mod(N), as described in the cited
prior art. The various stages represented on the flow
chart carry out the following operations.

At 11: The circuit is initialized.

- Power counter P is loaded with the encryption
key length.

- Multiplier counter M is reset to zero.

- Register E is loaded with the encryption key.

- Register N is loaded with the modulo number.

- Register M is loaded with the message.

- Register S is reset to zero.

- Register C is set to binary value of 1.

At 12, 13: If the power counter was not set to the
correct power length, a normalizing process takes
place. Register E is shifted to the left until a binary "1"
is located in its MSB (Most Significant Bit) position. In
addition, for each shift the power counter is
decremented by one.

At 14: The squaring of the content of register C is
performed. Further details of the multiplication
process are described in Figs. 2 and 3.

At 15: Depending on the MSB bit of register E,
stage 16 is performed or not.

At 16: The modular multiplication of registers C
and M is performed.

At 17: The shifting of register E is performed.
Consequently the next E bit is positioned at the MSB
position, and according to its value the next (C*M)mod(N)
will be performed. In addition, counter P is
decremented.

At 18: If counter P reached its final state, the
encryption process is terminated.
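
The stages of Fig. 1 can be sketched in software as a left-to-right square-and-multiply loop. The following is a minimal Python sketch, not the circuit itself: the function name and the `length` parameter (the width of the E register in bits) are assumptions made for illustration, and the modular multiplications are written with Python's `%` operator rather than the serial process of Figs. 2 and 3.

```python
def modexp_msb_first(m, e, n, length):
    """Return m**e mod n, scanning the bits of e from the MSB down.

    Assumes 0 <= e < 2**length; `length` is a hypothetical register width.
    """
    mask = (1 << length) - 1
    msb = 1 << (length - 1)
    if e == 0:
        return 1 % n
    p = length                  # power counter P
    c = 1                       # register C is set to 1
    # Stages 12, 13: shift E left until its MSB is 1, decrementing P.
    while not e & msb:
        e = (e << 1) & mask
        p -= 1
    while p > 0:                # stage 18: stop when P reaches its final state
        c = (c * c) % n         # stage 14: square C
        if e & msb:             # stage 15: test the MSB of E
            c = (c * m) % n     # stage 16: multiply C by M
        e = (e << 1) & mask     # stage 17: shift E ...
        p -= 1                  # ... and decrement P
    return c
```

For instance, `modexp_msb_first(4, 13, 497, 8)` gives the same value as Python's built-in `pow(4, 13, 497)`.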

The flow of this diagram is implemented by using
a state machine based on a ROM. This state machine
supplies all the control signals to the whole circuit.
The description of each control signal is detailed in the
following figures.

Fig. 2 describes the flow chart for performing the
serial multiplication process using one adder.

At 20: The content of register S is loaded into
registers C and B, and S is reset to zero.

At 22: Either the M register LSB (Least Significant
Bit), while performing (M*C)mod(N), or the LSB bit
in register C, if performing (C*C)mod(N), determines
whether to perform stages 23, 24, 25 or not.

At 23: The first addition of the contents of registers
S and B. The result goes back to register S (S = S+B).

At 24, 25: Here the second addition takes place.
The modulo reduction is performed where N is
subtracted from the (S+B) result, which is stored back in
register S. The result S or (S-N) is stored back in register
S depending on the carry bit value.

At 26, 27, 28: Performing the third addition,
register B is shifted to the left, thus multiplying it by 2.
Then N is subtracted from it. Again, the result 2B or
(2B - N) is stored back in register B depending on the
carry.

The operations S-N and 2B-N ensure that no
value stored in registers S and B during the entire
modular multiplication operation ever exceeds the
modulo N.

At 29: The M counter is incremented by 1.

At 30: The M counter is tested to see if it reached
the final count.
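
The serial multiplication of Fig. 2 amounts to a shift-and-add loop in which S and B are each reduced by at most one subtraction of N per step. A minimal Python sketch follows; the function name and the `length` parameter (the number of multiplier bits counted by counter M) are assumptions for illustration, and comparisons such as `s >= n` stand in for the carry-bit test of the hardware.

```python
def modmul_shift_add(multiplier, multiplicand, n, length):
    """Return (multiplier * multiplicand) mod n by serial shift-and-add.

    Assumes 0 <= multiplicand < n and 0 <= multiplier < 2**length.
    """
    s = 0                       # stage 20: S is reset to zero
    b = multiplicand            # stage 20: B holds the multiplicand
    for _ in range(length):     # stages 29, 30: counter M runs to its final count
        if multiplier & 1:      # stage 22: test the LSB of the multiplier
            s += b              # stage 23: S = S + B
            if s >= n:          # stages 24, 25: keep S or S - N
                s -= n
        multiplier >>= 1        # right shift exposes the next multiplier bit
        b <<= 1                 # stage 26: B = 2B
        if b >= n:              # stages 27, 28: keep 2B or 2B - N
            b -= n
    return s
```

Because S and B both stay below N after every step, a single conditional subtraction is enough each time, which is the property the text attributes to the operations S-N and 2B-N.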

The flow chart of Fig. 3 describes a multiplier for
obtaining the same result using two parallel adders.

The purpose of having two parallel adders (and
later, having three parallel adders) is to expedite the
process (at the expense of adding hardware).

At 31: The content of register S is loaded into
registers C and B, and S is then reset to zero.

At 32: The multiplying counter M is incremented.

At 33: From this stage, two parallel processes are
performed.

At 34, 35, 36: Right branch - the content of register
B is modulo multiplied by 2.

At 37, 38, 39, 40: Left branch - According to the
multiplicand's LSB bit, in register M or C, the modulo
addition of S and B is performed.

The two additions required for performing S =
(S+B) and B = 2B - N are done in parallel using the
two adders, while the third addition, which requires
the result of S + B for performing S = (S+B) - N, is done
in series after the result of S = S+B is stored back in
register S.

At 41: Counter M is checked to see if it reached
its final count.

The flow chart of Fig. 3 also describes a multiplier
using three parallel adders.

The only difference, with respect to the case of
two adders, is that during stages 37, 38, 39, 40 all the
additions are operated simultaneously in one clock
cycle, using the three adders: 1) S + B; 2) S - N; 3) 2B
- N. In both additions 2) and 3), N is subtracted or not
according to the carry bit.

Fig. 4 illustrates a cell, according to an embodiment
of the invention. All the 13 signals are generated
in a control block. The cell contains the following six
registers:

51) E: flip-flop (FF) part of the left shifting E
register.

52) M: FF out of the right shifting M register.

53) C: scan FF out of the right shifting C register.
It can also be loaded from the S register.

54) S: FF with a 3 -> 1 mux on its data input. It
selects one of three possible inputs: a) the addition
result; b) the previous S; and c) shift right
option (used for loading and unloading of data).

55) B: FF with a 4 -> 1 multiplexer on its data input.
The mux selects one of four possible inputs: a) B
= B; b) B = 2B; c) B = 2B - N; and d) B = S.

56) N: FF out of the right shifting N register.

The outputs toward the adders are: a) B; b) 2B;
c) N; d) S. The input SUM is the sum result of the
addition.

The control bus (13 signals wide - CON[0:12])
supplies all the control signals to the elements in the cell.

The signals connected to the cell adjacent on the
left are marked EL, ML, CL, SL, BL, NL.

The signals connected to the cell adjacent on the
right are marked ER, MR, CR, SR, BR, NR.

The signals connected to the adder are on the
bottom of the cell. The signals S, B, N, 2B are one bit
slice of the operands that the adder uses, in particular
for the RSA algorithm on normal register operands,
for executing: S+B, S-N, 2B-N. The SUM input is the
adder's result.

Element 51 is one F.F of an L bit long left shifting
register (E). Its data input is ER and output EL. Signal
CON0 is the clock signal, which advances the shift
register.

Element 52 is one F.F of an L bit long right shifting
register (M). Its data input is ML and output MR. Signal
CON1 is the clock signal, which advances the shift
register.

Element 53 is one F.F with a 2 to 1 multiplexer on
its data input out of an L bit long right shifting register
(C). Signal CON2 selects which input will be loaded
to the F.F:

a) CL - output from adjacent cell on the left

b) S - value in register S.

Signal CON3 is the clock which loads the selected
data into the F.F. Signal CON4 resets the register.

Element 54 is one F.F with a 3 to 1 multiplexer on
its data input out of an L bit long right shifting register
(S). Signals CON5 and CON6 select which input will
be loaded to the F.F:

a) S - previous value

b) SUM - adder's result

c) SL - output from adjacent cell on the left

Signal CON7 is the clock which loads the selected
data into the F.F. Signal CON8 resets the register.

Element 55 is one F.F with a 4 to 1 multiplexer on
its data input out of an L bit long shifting register (B).
Signals CON10 and CON11 select which input will be
loaded to the F.F:

a) B - previous value.

b) S - content of register S.

c) BR - output from adjacent cell on the right

d) SUM - adder's result

Signal CON9 is the clock which loads the selected
data into the F.F. Signal CON4 resets the register.

Element 56 is one F.F of an L bit long right shifting
register (N). Its data input is NL and output NR. Signal
CON12 is the clock signal, which advances the shift
register.
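
To illustrate how the 4 to 1 selection on the B flip-flop behaves across a row of cells, the following minimal Python sketch models register B as a list of one-bit cells, with index 0 as the rightmost cell. The function name and the string labels used for the four selections are hypothetical; the text does not give the encoding of CON10 and CON11, so none is assumed here.

```python
def step_b_register(b_bits, s_bits, sum_bits, select):
    """Return the next contents of B after one clock of the B flip-flops.

    b_bits[0] is the rightmost (least significant) cell. `select` is a
    hypothetical label for the mux input chosen in every cell:
      "hold"       -> a) B, previous value
      "load_s"     -> b) S, content of register S
      "shift_left" -> c) BR, output of the adjacent cell on the right
      "sum"        -> d) SUM, the adder's result
    """
    next_bits = []
    for i in range(len(b_bits)):
        br = b_bits[i - 1] if i > 0 else 0   # rightmost cell shifts in 0
        choices = {
            "hold": b_bits[i],
            "load_s": s_bits[i],
            "shift_left": br,
            "sum": sum_bits[i],
        }
        next_bits.append(choices[select])
    return next_bits
```

Selecting the BR input in every cell moves each bit one place to the left, which doubles the stored value: the bits of 5, given as `[1, 0, 1, 0]`, become `[0, 1, 0, 1]`, the bits of 10.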

The uniqueness of the complete solution lies in
this cell, which enables the complete algorithm to be
performed with all the connections between the
registers limited to within the cell. Handcrafting the
layout of this cell into a small silicon area, and making
sure that the inputs on the left side of the cell match
the outputs of the cell on the left, and likewise
for the right side, will make the task of generating a
complete L bit solution very compact and straightforward.
Similarly, connections to the adder and feedback
to the accumulating registers are efficiently
enabled.

Fig. 5 is similar to Fig. 4, but includes two 2 to 1
multiplexers. The two multiplexers, controlled by
signals CON13 and CON14, select which of the following
additions will be performed:

a) S+B;

b) S-N;

c) 2B-N.

The complete block diagram of Fig. 2, which exhibits
a modular multiplication process, is implemented
in Fig. 5 as follows:

The numbering of the following blocks refers to
Fig. 2. The implementations refer to Fig. 5.

Block 20 is implemented by having the result of
the previous modular multiplication process, which
resides in register S, move to both registers B and C.
This is done prior to the resetting of register S. The
move is done as follows:

Signal CON2 selects the output of register S to be
loaded into register C.

Signals CON10 and CON11 likewise select the output of
register S to be loaded into register B.

During the next clock cycle signal CON9 resets
register S.

Block 22 is implemented by checking the rightmost
output of either register C or M, depending on
whether a (CxC) mod N or (MxC) mod N is required.

Block 23 is implemented as follows:

Signal CON13 selects the contents of register S to
output A.

Signal CON14 selects the contents of register B to
output B.

Signals CON5 and CON6 select the output of the
adder (A+B) to be loaded into register S.

Blocks 24 and 25 are implemented as follows:

Signal CON13 selects the contents of register S to
output A.

Signal CON14 selects the contents of register N to
output B.

(N is represented in its 2's-complement form).

Signals CON5 and CON6, which are a function of
the carry bit, will select which result will be loaded into
register S. (S-N) is loaded if S>N, and S is loaded if
S<N.

Blocks 26, 27, 28 are implemented as follows:

Signal CON13 selects the contents of register B to
output A, but by using the output of FF B from the cell
on the right, instead of the FF in this cell, which
generates 2B.

This consideration exemplifies one of the main
approaches unique to the invention, that is, aiming
the design toward cellular implementation.

Signal CON14 selects, as explained before, the
contents of register N to output B.

Signals CON10 and CON11, which are a function of
the carry bit, will select which result will be loaded.
(2B-N) is loaded if 2B>N, and 2B is loaded if 2B<N.

By adopting the above approach, the invention
makes it possible to implement within a cell, and by using one
external adder, all the operations needed for a
complete modular multiplication process.

Fig. 6 describes the L-bit long solution. The cells,
such as the cells of Fig. 5 indicated at 60, are
connected in series, while the signals needed for the
adder 61 are extracted from the bottom of each cell.
This approach prevents large routing overhead, since
all the signals are localized. Between the cells only six
signals are transmitted, and these, with correct layout
of the individual cells, require no routing.

Fig. 7 describes a single-bit cell, wherein use is
made of a CLA type adder. Numerals 51 to 56
designate the same registers as in Figs. 4 and 5, and 62 and
63 indicate two multiplexers. The cell includes the
necessary logic for producing the Generate and
Propagate signals for the L-bit long CLA adder, and
the carry logic.

Fig. 8 illustrates the logic implementation of a 4 bit
secondary modular unit with CLA adder. The drawing
is self-explanatory.

Fig. 9 describes a 4-bit secondary modular unit,
comprising four primary single bit modular units or
cells 70 with all the interconnections between the
adjacent cells and the 4-bit CLA block 71. The inputs
and outputs of this 4-bit unit are identical to the I/O of
the single bit cell.

If each cell 70 is substituted in Fig. 9 with a 4-bit
secondary modular unit, a 16-bit tertiary unit is
obtained. Analogous results are obtained by a similar
substitution in any diagram in which single-bit cells
appear. If each single-bit unit is substituted with a
16-bit unit, a 64-bit unit is obtained and so on. Any
desired multiple units can be built up in this way.
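
The way Generate and Propagate signals let 4-bit units be combined into a 16-bit unit can be sketched as follows. This is a minimal Python model under stated assumptions, not the logic of Fig. 8: the function names are hypothetical, the per-bit propagate is taken as the XOR of the operand bits, and the group carries in `cla16` are chained from one 4-bit unit to the next rather than produced by a second lookahead level.

```python
def cla4(a, b, cin):
    """Add two 4-bit values with carry lookahead.

    Returns (sum, group_generate, group_propagate).
    """
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]      # generate per bit
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]    # propagate per bit
    # Every carry is written directly in terms of g, p and cin.
    c = [
        cin,
        g[0] | (p[0] & cin),
        g[1] | (p[1] & g[0]) | (p[1] & p[0] & cin),
        g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
             | (p[2] & p[1] & p[0] & cin),
    ]
    total = sum((p[i] ^ c[i]) << i for i in range(4))
    group_g = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
               | (p[3] & p[2] & p[1] & g[0]))
    group_p = p[0] & p[1] & p[2] & p[3]
    return total, group_g, group_p


def cla16(a, b, cin=0):
    """Add two 16-bit values using four 4-bit units. Returns (sum, carry_out)."""
    total, carry = 0, cin
    for k in range(4):
        s, group_g, group_p = cla4((a >> (4 * k)) & 0xF,
                                   (b >> (4 * k)) & 0xF, carry)
        total |= s << (4 * k)
        carry = group_g | (group_p & carry)   # carry into the next unit
    return total, carry
```

Since each 4-bit unit exposes the same kind of generate and propagate outputs as a single bit does, the same substitution can be repeated to reach 64 bits and beyond, mirroring the build-up described for the modular units.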

Fig. 10 describes a 512-bit unit. The design
requires two 256-bit units 80 and a 2-bit CLA adder
81. The carry block 82 stores the carry result of the
addition for use in the next clock cycle.

Fig. 11 describes the content of a single-bit cell
while using the two adders solution. Registers 51 to
56 correspond to those of Figs. 4 and 5. The 2 -> 1
mux 83 selects which addition should be performed:
a) S+B; or b) S-N. The third addition of 2B-N is done
simultaneously with the first S+B addition.

The description of Fig. 4 also holds here, except
for a slight difference in which two additions are done
simultaneously (2B-N and S+B), as shown in Fig. 3,
and thus signal CON13 only selects whether to add
S+B or S-N. Signal CON14 is not required.

Fig. 12 describes the L-bit long solution while
using two adders 84 and 85. The cells may be the
same as units 60 of Fig. 6.

Fig. 13 describes a cell embodying a 3 adder
solution. Registers 51 to 56 may be the same as those
of Figs. 4 and 5. Here no mux is required on the
outputs toward the adder, since all the additions are
performed simultaneously.

The description of Fig. 4 also holds here, except
for a slight difference in which all the additions are
done simultaneously as shown in Fig. 3, and thus
signals CON13 and CON14 are not required.

Fig. 14 describes the L-bit long solution using
three adders 86, 87 and 88. The cells may be
the same cells 60 of Fig. 6.

Fig. 15 describes a phase of operation of the
microcircuit according to the invention, under which a
multiplication of two operands, each consisting of L
bits, takes place. Such an operation is needed for various
applications. The circuit operates in a well known
manner, clear to persons skilled in the art.

Fig. 16 describes a phase of operation of the
invented microcircuit under which the value J = -N^(-1)
mod 2^L, where N is the modulus and L is the word size,
is calculated. The advantage in being able to calculate
such a value, in order to implement a divisionless
modular multiplication of 2L-bit operands, based on
Montgomery reduction, is clear to persons skilled in
the art.

The circuit of Fig. 16 is a generalization of that of
Fig. 15. During one phase of operation, when the
switches sw (both operate synchronously) are at the
dotted position, the circuit acts as the multiplier of Fig.
15. When the switches sw are at their depicted (solid)
position, the circuit generates the value of J, given N.
Initially, the second bit from the right in S is set to 1
(the rest are 0), M is set to 0 and B stores N. The
process starts with a shift. N is added to the contents of
S, and S shifts. This shift and add process continues,
where N is added to the contents of S whenever the
LSB of S is 1. After L shifts the contents of M is J.

The shift is from the second place in S, since the
bits which are dropped out from the least significant
place are always 0, and should therefore be shifted
into M before being changed by addition of N. Another
possibility for generating the bits in M would be to
sense the bit at the least significant place in S, before
addition takes place, and shift it into M. In this case the
initial state of S is 1 as an LSB, and the process starts
by sensing this 1 and shifting it into M, before addition
of N takes place.

By use of the circuit of Fig. 16, J is calculated by
an operation equivalent to one multiplication.
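
The second variant described above (S starts with 1 as its LSB, and the LSB is sensed and shifted into M before N is added) can be followed in a minimal Python sketch. The function name is hypothetical, N is assumed to be odd as an RSA modulus is, and the bits of J are collected into an integer rather than shifted through a register.

```python
def neg_inverse_mod_2l(n, L):
    """Return J = -n^(-1) mod 2**L by the sense, add and shift process.

    Assumes n is odd. Invariant after i steps: 1 + n * j == s * 2**i.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd")
    s = 1                       # initial state of S: 1 as the LSB
    j = 0                       # register M, which will hold J
    for i in range(L):
        bit = s & 1             # sense the LSB of S before any addition
        j |= bit << i           # shift the sensed bit into M
        if bit:
            s += n              # adding N clears the LSB of S
        s >>= 1                 # S shifts right by one place
    return j
```

After L steps the invariant gives 1 + N*J = S * 2^L, so N*J is congruent to -1 modulo 2^L, which is the defining property of J. For N = 13 and L = 4 the sketch yields 11, and 13 * 11 = 143 = 9 * 16 - 1.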

A possible variation: Having an initial all 0
contents of S and M.

If, for convenient design considerations, it is
preferred to have an all 0 contents for the accumulator,
this can be achieved by having two 'not' gates in the
places marked by * in Fig. 16. The rest of the circuit
remains unchanged. The introduction of the 'not' gates
may also facilitate the implementation due to pure
technological considerations. Note that in this case, N
is added to the contents of S for the purpose of
generating 1's at the LSB, rather than 0's.

Fig. 17 describes a phase of operation of the
microcircuit according to the invention, under which a
2L-bit value is divided by an L-bit value. The double-precision
value X2X1 is stored in registers B, E, which shift
together to the left. The division operation follows the
line in which before each shift B is replaced by B-N if
B-N is positive. Here we exploit the feature offered by
the microcircuit according to the invention, enabling
the replacement of 2B by 2B or 2B-N, depending on
the sign bit of 2B-N.

However, the operation offered by the engine is
2B-N, performed after a shift, whereas we need B-N
to be performed before a shift. (Note also that if X is
the contents of B, then after a shift the contents is 2X
+ E(n) and not purely 2X, where E(n) denotes the MSB
of E, shifted into B.) In order to handle this situation,
the initial contents of the concatenation B,E will be the
double precision dividend shifted one place to the
right. (That is, the MSB in B is 0.) The LSB of E is
stored temporarily in a buffer and is shifted into E
during the first shift. The operation 2B-N performed by the
microcircuit according to the invention after each shift
is then actually B-N performed before a shift. L+1
shifts are however required. The decision on whether
to replace B by B-N is done by the original flag.

This operation results in a pure division, where the
bit qi of the quotient is 1 or 0 depending respectively
on whether N is subtracted at step i or not. That is, qi
is the flag which decides whether 2B is replaced by
2B-N or not. The bits of the quotient are generated
starting with the MSB. They can be shifted back into
E, which is emptied from its left, while being filled in
from the right.
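
The shift-and-subtract division can be sketched in Python as follows. This is the plain textbook form, with the subtraction tested after each shift of the B,E pair; it does not reproduce the pre-shifted dividend or the L+1 shifts described above. The function name is hypothetical, and the high word of the dividend is assumed to be smaller than N so that the quotient fits in L bits.

```python
def divide_2l_by_l(x, n, L):
    """Divide a 2L-bit value x by an L-bit value n.

    Returns (quotient, remainder). Assumes (x >> L) < n.
    """
    mask = (1 << L) - 1
    b, e = x >> L, x & mask     # B holds the high word, E the low word
    if b >= n:
        raise ValueError("high word must be smaller than n")
    q = 0
    for _ in range(L):
        # B and E shift left together; the MSB of E moves into B.
        b = (b << 1) | (e >> (L - 1))
        e = (e << 1) & mask
        q <<= 1
        if b >= n:              # flag: replace B by B - N
            b -= n
            q |= 1              # quotient bit, generated MSB first
    return q, b
```

In the circuit the quotient bits are shifted back into E from the right as E empties from the left; the sketch keeps them in a separate variable `q` for readability.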

Based on the division operation described above
it would be possible to perform the operation A B mod
N for 2L-bit operands. This way, the circuit can
process operands whose length is twice that of the
individual registers. For example, for L = 512, the circuit
can perform a modular multiplication of 1024-bit
operands. Referring to the following process, the
performed operation is (A1A0) (B1B0) mod (N1N0), where
each subscripted term is an L-bit value. (It is assumed
that the two multiplicands are smaller than the modulus.)

Notation

XiXj or YXi means a concatenation.
Xi·Xj or Y Xi means multiplication.

If Y is obtained from a division of two values then
Yq denotes the quotient part of Y.

Preliminary step: Shift N to the left until its MSB
is 1. Denote by i the number of shifts. Shift one of the
multiplicands (say A) i places to the left. N1N0 and
A1A0 refer to N and A after the shift. (Note that since
A < N it is impossible that shifting A i places to the left
results in an overflow.)

Step 1

X = A1 B.

Y = X2X1:N1.

Z = X - Yq N. If Z is negative then Z = Z + N.
S1 = Z·10

Step 2

X = S1 + A0 B. Cout denotes the value of the carry
generated beyond a three character result.
X2X1 = CoutX2X1 - Cout N

Y = X2X1:N1. Cout denotes the value of the carry
generated beyond a one character Yq.

X2X1 = X2X1 - Cout N, Yq = Yq - Cout 10.
Z = X - Yq N.

If Z is negative then Z = Z + N.
If Z is negative then Z = Z + N.
If Z is negative then Z = Z + N.
Shift Z i places to the right.

Z = A B mod N.
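
The arithmetic identity behind the two steps can be checked with a minimal Python sketch. It rests on several assumptions: the function name is hypothetical, Python's `%` operator stands in for the quotient estimate Yq and the corrective additions of N, and the preliminary normalizing shift of N and A is omitted. Only the splitting of A into A1 and A0 is taken from the process above.

```python
def modmul_double(a, b, n, L):
    """Return (a * b) mod n, processing a as two L-bit characters A1 and A0."""
    mask = (1 << L) - 1
    a1, a0 = a >> L, a & mask   # A = A1 * 2**L + A0
    z = (a1 * b) % n            # Step 1: reduce X = A1 B modulo N
    s1 = z << L                 # S1 = Z followed by one zero character
    x = s1 + a0 * b             # Step 2: X = S1 + A0 B
    return x % n                # final reduction gives A B mod N
```

The result equals A B mod N because A B = (A1 B) * 2^L + A0 B, and reducing A1 B modulo N before the shift does not change the value modulo N.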

An alternative process

Preliminary step: N and A are moved as before.

Step 1

X = A1 B.

Y = X2X1:N1.

Z = X - Yq N. If Z is negative then Z = Z + N.

Step 2

Y = Z:N1.

Z = Z0 - Yq N. If Z is negative then Z = Z + N.

Step 3

X = Z + A0 B.

Y = X2X1:N1. Cout denotes the value of the carry
generated beyond a one character Yq.

X2X1 = X2X1 - Cout N, Yq = Yq - Cout 10.

Z = X - Yq N.

If Z is negative then Z = Z + N.

If Z is negative then Z = Z + N.

Shift Z i places to the right.

Z = A B mod N.

The division operation used in the process, which
is facilitated by the circuit of Fig. 17, is X2X1:N1.

Some further points concerning the unique features
of the circuit according to the invention, enabling
the execution of X2X1:N1:

a) Concerning the L+1 shifts executed during the
division of L-bit operands: The operation 2B-N
performed after the first shift is X2 - N1. When this
operation is performed in Step 1, it is ensured that
X2 < N1, and the first generated qi is the second
Cout.

b) A possibility for doing without the connection
from register E to B: Yq was defined as the quotient
obtained from the operation Y = X2X1:N1. Let
Yq' denote the quotient obtained from the operation
X20:N1. That is, X1 is replaced by 0's. Yq' is
generated by register B, with 0's shifted into it.
Here we can implement the original functioning of
B, that is, B is replaced by 2B - N or not, depending
on the sign of 2B-N. Yq' = Yq or Yq' = Yq - 1,
depending on whether X1 < N1 or not. If we prefer
to use Yq', thus saving the connection of registers
E to B, Steps 1 and 2 should be modified. The
operation Z = X - Yq N should be extended to:

Z = X - Yq N. If Z > N then Z = Z - N.

c) Concerning the operation Yq N:

Yq resides in register E which is a left shift register.
When multiplying Yq by a number, Yq is shifted out of
E starting with its MSB. This could cause some
difficulties in the multiplication. However, shifting the
contents of E into B (which justifies again the connection
of E) solves this problem, since the engine naturally
handles a multiplication in which one of the
multiplicands is in B.

While a number of embodiments of the invention
have been described by way of illustration, it is
understood that the invention may be carried out in many
other ways by persons skilled in the art, without
departing from its spirit or from the scope of the
claims.


Claims 

1. Microcircuit means for the implementation of RSA
encryption and decryption transformations,
characterized in that it consists of a plurality of
units, to which external adders are attached,
thereby incorporating the RSA function.

2. Microcircuit means according to claim 1, wherein
each of the cells comprises six 1-bit memory elements,
each being a slice of an L-bit shift register.

3. Microcircuit means according to claim 1, wherein
all communications between registers are performed
within a cell.

4. Microcircuit means according to claim 1, wherein
the modular units are secondary units, which
comprise a number of cells and an adder.

5. Microcircuit means according to claim 1, comprising
a number N of multiple units according to
claim 3.

6. Microcircuit means according to claim 1, wherein
the modular units additionally comprise suitable
adders.

7. Microcircuit means according to claim 5, wherein
there are 1, 2 or 3 adders.

8. Microcircuit means according to claim 3, wherein
from 1 to 3 adders are provided for each "n" number
of cells.

9. Microcircuit means according to claim 3, wherein
from 1 to 3 adders are provided for each "N" number
of multiple units.

10. Microcircuit means according to any one of the
preceding claims, comprising CLA adders.

11. Microcircuit means for the implementation of RSA
encryption and decryption transformations for
operands whose size is 2L, twice that of the individual
registers, by a method chosen among
Montgomery reduction and modular multiplication
of 2L-bit operands by the use of L-bit division
circuitry according to any of the preceding
claims.

12. Microcircuit means according to claim 11, wherein
the Montgomery reduction is implemented by
the calculation of J = -N^(-1) mod 2^L, where N is the
modulus and L is the word size.